cloudv2: Higher resolution for Histograms #3302

codebien · 2023-08-24T09:01:16Z

What?

It expands the resolution for Histograms, setting 0.001 as the minimum resolution.

Why?

It allows us to support numbers with up to 3 decimal digits, increasing the detail of the values that we can store.

CLAassistant · 2023-08-24T09:01:22Z

All committers have signed the CLA.

codecov-commenter · 2023-08-24T09:11:11Z

Codecov Report

Merging #3302 (180fa7c) into master (d5b28fb) will decrease coverage by 0.02%.
Report is 2 commits behind head on master.
The diff coverage is 84.21%.

❗ Current head 180fa7c differs from pull request most recent head a7dec90. Consider uploading reports for the commit a7dec90 to get more accurate results

@@            Coverage Diff             @@
##           master    #3302      +/-   ##
==========================================
- Coverage   73.21%   73.19%   -0.02%     
==========================================
  Files         258      258              
  Lines       19886    19895       +9     
==========================================
+ Hits        14559    14563       +4     
- Misses       4404     4408       +4     
- Partials      923      924       +1

Flag	Coverage Δ
ubuntu	`73.13% <84.21%> (-0.03%)`	⬇️
windows	`73.07% <84.21%> (+<0.01%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed	Coverage Δ
js/modules/k6/grpc/client.go	`84.35% <62.50%> (-0.34%)`	⬇️
output/cloud/expv2/hdr.go	`100.00% <100.00%> (ø)`

... and 1 file with indirect coverage changes

mstoykov

LGTM in general 👍

I left some non-blocking comments.

But I would prefer that we add some tests with values between 0 and 1, which is what this change is about. Maybe even with some values that the browser team considers representative.

output/cloud/expv2/hdr.go

mstoykov · 2023-08-25T07:37:08Z

output/cloud/expv2/hdr.go

@@ -151,6 +164,9 @@ func histogramAsProto(h *histogram, time int64) *pbcloud.TrendHdrValue {
 	if h.ExtraHighBucket > 0 {
 		hval.ExtraHighValuesCounter = &h.ExtraHighBucket
 	}
+	// We don't expect to change the minimum resolution at runtime
+	// so a pointer is safe here
+	hval.MinResolution = &h.MinimumResolution


Is there a reason to use a pointer though?

It won't save memory, and it adds a pointer dereference when we actually have to write the value.

I guess the reason is that this field is not required in the protobuf definition 🤦

Not certain we can do anything about this 🤷

Yeah, the reason is how Protobuf defines optional values. We haven't yet released the new cloud output v2 as the default, so we could consider migrating directly to a new Protobuf version that makes the field required. But my feeling is that it would require a maintenance window.

I believe that if we un-optional this field, this doesn't really change anything for the backend?

@esquonk if the suggestion is to change it only on the client, then it would require having different proto file versions between the backend and the client, which doesn't sound optimal to me.
If instead you mean changing it on both, then the problem is how to handle the versioning. If we set the field as required on the backend, then the already released k6 v0.46.0 will stop working if we force K6_CLOUD_API_VERSION=2 by overwriting it from the backend. We could consider that a non-issue, since we are pretty confident that we will set it directly from the code when releasing v0.47.0, but I'm not sure. Thoughts?
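The pointer discussed in this thread comes from how Go code generated for Protobuf typically represents an optional scalar field: as a pointer, where nil means "not set". The sketch below is a hand-written illustration of that shape, not the generated `pbcloud` code; the types `optionalValue` and `plainValue` are hypothetical, and only the field name `MinResolution` is taken from the diff above.

```go
package main

import "fmt"

// optionalValue imitates the shape of a message with an optional double
// field: the field is a pointer so that "unset" can be told apart from 0.
// Hypothetical type, written by hand for illustration.
type optionalValue struct {
	MinResolution *float64
}

// GetMinResolution follows the usual nil-safe getter pattern:
// it returns the zero value when the message or the field is unset.
func (v *optionalValue) GetMinResolution() float64 {
	if v != nil && v.MinResolution != nil {
		return *v.MinResolution
	}
	return 0
}

// plainValue imitates the same message with a non-optional field:
// a plain value, with no pointer and no dereference on write.
type plainValue struct {
	MinResolution float64
}

func main() {
	res := 0.001

	unset := &optionalValue{}
	set := &optionalValue{MinResolution: &res}
	fmt.Println(unset.MinResolution == nil, unset.GetMinResolution())
	fmt.Println(set.GetMinResolution())

	plain := plainValue{MinResolution: res}
	fmt.Println(plain.MinResolution)
}
```

With a non-optional field the receiver can no longer distinguish "not sent" from the zero value, which is why changing the field affects how older clients are handled.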

olegbespalov

LGTM, but I join Mihail on his comments 👍

output/cloud/expv2/hdr_test.go

mstoykov

LGTM!

Do we have any statistics on how much this change makes values over 1 less accurate?

I guess it would be good to have some kind of graph showing how accurate the samples are between 0 and ... something. Let's say whatever 1 minute is in ms, so 1 * 60 * 1000 = 60000?

mstoykov · 2023-08-25T11:40:37Z

That graph is mostly interesting for the release notes, but it will still be good to have it earlier IMO.

codebien · 2023-08-28T10:51:50Z

We should not lose any precision because what we are doing is just using a factor for moving the values across the histogram. The relative error rate is generated from the number of buckets that we have in the range, and it is a constant value. In our case the error rate is ~1%.

Formula: Precision = 100% / (number of sub-buckets in a major bucket) = 100 / 2^7 ≈ 0.78%

So the boundaries of each value are just upscaled for storage and downscaled for query but they will be the same.

Resolution	Seconds (v/min_res)	In Millis (v*1e3)	Lower limit [ millis - (millis * 1%) ]	Upper limit [ millis + (millis * 1%) ]
1.0	60	60k	59.4k	60.6k
0.001	60k	60M	59.4M	60.6M
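The arithmetic in this comment can be checked with a short Go sketch. It assumes 2^7 sub-buckets per major bucket, as in the formula, and uses the rounded 1% figure for the limits, as the table does. The names `relativeErrorPercent` and `limits` are made up for illustration and do not come from the k6 code.

```go
package main

import "fmt"

// subBucketBits is the exponent from the formula (2^7 sub-buckets).
const subBucketBits = 7

// relativeErrorPercent returns 100 / 2^subBucketBits.
func relativeErrorPercent() float64 {
	return 100.0 / float64(uint64(1)<<subBucketBits)
}

// limits returns v minus and plus pct percent of v.
func limits(v, pct float64) (lower, upper float64) {
	delta := v * pct / 100
	return v - delta, v + delta
}

func main() {
	fmt.Printf("%.5f%%\n", relativeErrorPercent()) // 0.78125%

	// Resolution 1.0: one minute stored as 60k.
	lo, hi := limits(60_000, 1)
	fmt.Printf("%.0f %.0f\n", lo, hi) // 59400 60600

	// Resolution 0.001: the same minute stored as 60M.
	lo, hi = limits(60_000_000, 1)
	fmt.Printf("%.0f %.0f\n", lo, hi) // 59400000 60600000
}
```

The limits scale by the same factor as the stored value, so the relative width of the interval is identical at both resolutions.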

@mstoykov does it make sense to you?

mstoykov · 2023-08-29T12:44:18Z

@codebien it makes some sense ... but we still change the number of buckets.

From my experiments it is barely noticeable in the values we have.

// Lives in hdr_test.go so that the unexported resolveBucketIndex is reachable;
// it needs the "fmt" and "testing" imports.
func TestSequence(t *testing.T) {
	resolution := 0.001
	first := 50_000.0
	end := 60_000.0 // not named max, to avoid shadowing the builtin
	increment := 1.0

	firstIndex := resolveBucketIndex(first / resolution)
	last := firstIndex
	// Walk the range and remember the most recent bucket index seen.
	for v := first; v <= end; v += increment {
		index := resolveBucketIndex(v / resolution)
		if last != index {
			// fmt.Println(last, index, v)
			last = index
		}
	}
	fmt.Printf("buckets total in range %f to %f with resolution %f are %d\n", first, end, resolution, last-firstIndex)
}

For example, this results in

buckets total in range 50000.000000 to 60000.000000 with resolution 0.001000 are 38

vs

buckets total in range 50000.000000 to 60000.000000 with resolution 1.000000 are 39

So there is one bucket fewer at resolution 0.001, which is a very small reduction of accuracy in the 50 to 60 second range. If I try bigger ranges, the difference is again 1-2 buckets either way, so it seems okay.

Obviously, if you go from 0 to 1, the difference is huge:

buckets total in range 0.000000 to 1.000000 with resolution 0.001000 are 506

vs

buckets total in range 0.000000 to 1.000000 with resolution 1.000000 are 1

I think we can tl;dr it to "there is no loss of resolution in the values above 1", while there is a significant improvement in the ones between 0 and 1.

edit: p.s. the code was added to hdr_test.go as the easiest way to use the bucket index generation function without exporting it and using it in a main package.
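For readers without the repository at hand, the bucket counts above can be reproduced with a self-contained sketch. It assumes a log-linear layout with 2^7 sub-buckets per major bucket and values rounded up to the next integer before indexing. `bucketIndex` and `bucketsBetween` are hypothetical stand-ins written for this illustration, not the unexported function from hdr.go, and the sketch assumes scaled values fit in a uint32.

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

// k is the assumed number of sub-bucket bits (2^7 sub-buckets per major bucket).
const k = 7

// bucketIndex maps an already scaled value to a bucket index.
// Hypothetical stand-in; assumes the value fits in a uint32.
func bucketIndex(val float64) uint32 {
	if val <= 0 {
		return 0
	}
	u := uint32(math.Ceil(val)) // round up so fractional values get a bucket
	if u < 1<<(k+1) {
		return u // small values: one bucket per integer
	}
	// shift is msb(u) - k: the number of low bits dropped in this major bucket.
	shift := uint32(bits.Len32(u>>k)) - 1
	return (shift << k) + (u >> shift)
}

// bucketsBetween counts how many bucket indexes separate first and last
// once both are scaled by the given resolution.
func bucketsBetween(first, last, resolution float64) uint32 {
	return bucketIndex(last/resolution) - bucketIndex(first/resolution)
}

func main() {
	fmt.Println(bucketsBetween(50_000, 60_000, 0.001)) // 38
	fmt.Println(bucketsBetween(50_000, 60_000, 1.0))   // 39
	fmt.Println(bucketsBetween(0, 1, 0.001))           // 506
	fmt.Println(bucketsBetween(0, 1, 1.0))             // 1
}
```

Because the index of a monotonically increasing input never decreases, subtracting the first index from the last gives the same count as walking the range one step at a time.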

output/cloud/expv2/hdr.go

codebien · 2023-08-30T16:35:23Z

Hey @olegbespalov @mstoykov I've pushed a new fixup that addresses #3302 (comment). It was a mistake: we check the max value after the resolution is applied, so the check already operates on the final value.

Can you take another look, please?

olegbespalov

Generally LGTM 👍

But I would prefer to clean up the tests (as mentioned in the comments) in the current PR to avoid future confusion.

However, I don't know how urgent merging this is, so I'm approving it.

olegbespalov · 2023-08-31T08:32:52Z

output/cloud/expv2/hdr_test.go

 			h := newHistogram()
+			// TODO: refactor
+			// An hack for preserving as the default for the tests the old value 1.0


TBH, it seems right now is the best moment to refactor this place to avoid future confusion.

olegbespalov · 2023-08-31T08:33:59Z

output/cloud/expv2/hdr_test.go

 			for _, v := range tc.vals {
 				h.Add(v)
 			}
+			tc.exp.MinimumResolution = 1.0


nit: it's better to keep this along with the other expectations

olegbespalov · 2023-08-31T08:34:46Z

output/cloud/expv2/hdr_test.go

@@ -161,57 +180,84 @@ func TestHistogramAddWithMultipleOccurances(t *testing.T) {
 		Sum:             466.20000000000005,
 		Count:           5,
 	}
+	exp.MinimumResolution = 1.0


why not define it in the struct?

olegbespalov · 2023-08-31T08:36:02Z

output/cloud/expv2/hdr_test.go

+		Count:             3,
+		MinimumResolution: 1.0,
+	}
+	h.MinimumResolution = 1.0


Isn't it already done on line 210 (straight after h := newHistogram())? 🤔

codebien · 2023-08-31T10:12:12Z

Hey @olegbespalov, we decided to make min_resolution required instead of optional, so I need to rework those tests anyway. I will address your valuable suggestions directly in the new PR (which I will push today, or tomorrow at the latest). I would prefer not to block this PR so I can unlock the browser for testing the full picture.

I will reference the comments here directly in the new PR.

olegbespalov · 2023-08-31T10:21:35Z

@codebien Sure, go ahead and merge it 👍

We increased the resolution of the histogram by using a different min_resolution value. This widens the range of supported values: decimal values between 0 and 1 are now supported. min_resolution is a factor that scales the ingested value. With the value set to 0.001, the histogram supports 3 decimal digits, whereas before it could only aggregate integer values.
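A minimal sketch of that scaling step, assuming the ingested value is divided by min_resolution and then rounded up to an integer before it is bucketed. The function name `scale` and the rounding with `math.Ceil` are assumptions made for this illustration.

```go
package main

import (
	"fmt"
	"math"
)

// scale converts an ingested value into the integer unit the histogram stores.
// Dividing by minResolution follows the description above; rounding up
// with math.Ceil is an assumption of this sketch.
func scale(v, minResolution float64) uint32 {
	return uint32(math.Ceil(v / minResolution))
}

func main() {
	for _, v := range []float64{0.25, 0.5, 0.75} {
		// At resolution 1.0 the three values collapse into the same integer;
		// at 0.001 they stay distinct.
		fmt.Println(v, scale(v, 1.0), scale(v, 0.001))
	}
}
```

With these inputs, resolution 1.0 maps all three values to 1, while resolution 0.001 keeps them apart as 250, 500 and 750.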

codebien self-assigned this Aug 24, 2023

codebien force-pushed the cloudv2-hdr-higher-res branch 2 times, most recently from 0a58f7c to 0a0c400 Compare August 24, 2023 14:51

codebien added this to the v0.47.0 milestone Aug 24, 2023

codebien requested a review from mstoykov August 24, 2023 14:55

codebien marked this pull request as ready for review August 24, 2023 14:56

codebien requested review from a team and olegbespalov and removed request for a team August 24, 2023 15:10

mstoykov previously approved these changes Aug 25, 2023

olegbespalov previously approved these changes Aug 25, 2023

codebien dismissed stale reviews from olegbespalov and mstoykov via 6143178 August 25, 2023 08:30

codebien commented Aug 25, 2023

output/cloud/expv2/hdr_test.go

codebien requested review from mstoykov and olegbespalov August 25, 2023 10:31

olegbespalov previously approved these changes Aug 25, 2023

mstoykov previously approved these changes Aug 25, 2023

esquonk reviewed Aug 30, 2023

output/cloud/expv2/hdr.go (outdated)

codebien added 2 commits August 30, 2023 13:05

cloudv2: Higher resolution for Histogram

4715df0

fixup! cloudv2: Higher resolution for Histogram

a6f757d

codebien dismissed stale reviews from mstoykov and olegbespalov via a6f757d August 30, 2023 15:48

codebien force-pushed the cloudv2-hdr-higher-res branch from a7dec90 to a6f757d Compare August 30, 2023 15:48

fixup! fixup! cloudv2: Higher resolution for Histogram

b978d87

codebien requested review from mstoykov and olegbespalov August 30, 2023 16:35

mstoykov approved these changes Aug 31, 2023

olegbespalov approved these changes Aug 31, 2023

codebien merged commit fd6551d into master Aug 31, 2023
20 checks passed

codebien deleted the cloudv2-hdr-higher-res branch August 31, 2023 10:28

codebien mentioned this pull request Sep 1, 2023

cloud: min_resolution field is required #3317

Merged
